AITopics | sentiment classification reveal line attractor

Reverse engineering recurrent networks for sentiment classification reveals line attractor dynamics

Neural Information Processing Systems, Dec-26-2025, 01:07:20 GMT

Recurrent neural networks (RNNs) are a widely used tool for modeling sequential data, yet they are often treated as inscrutable black boxes. Given a trained recurrent network, we would like to reverse engineer it--to obtain a quantitative, interpretable description of how it solves a particular task. Even for simple tasks, a detailed understanding of how recurrent networks work, or a prescription for how to develop such an understanding, remains elusive. In this work, we use tools from dynamical systems analysis to reverse engineer recurrent networks trained to perform sentiment classification, a foundational natural language processing task. Given a trained network, we find fixed points of the recurrent dynamics and linearize the nonlinear system around these fixed points. Despite their theoretical capacity to implement complex, high-dimensional computations, we find that trained networks converge to highly interpretable, low-dimensional representations. In particular, the topological structure of the fixed points and corresponding linearized dynamics reveal an approximate line attractor within the RNN, which we can use to quantitatively understand how the RNN solves the sentiment analysis task. Finally, we find this mechanism present across RNN architectures (including LSTMs, GRUs, and vanilla RNNs) trained on multiple datasets, suggesting that our findings are not unique to a particular architecture or dataset. Overall, these results demonstrate that surprisingly universal and human interpretable computations can arise across a range of recurrent networks.
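The fixed-point and linearization steps named in the abstract can be illustrated on a toy system. What follows is a minimal sketch, not the authors' code: the 2-unit tanh RNN, its weights `W` and `b`, and every function name are assumptions chosen so the result is easy to check by hand. It also substitutes plain iteration of the input-free update for a general fixed-point search, which only reaches stable fixed points.

```python
import math

# Hypothetical 2-unit vanilla RNN with zero input: h_next = tanh(W h + b).
# W is chosen (an assumption) to have eigenvalues 0.98 and 0.
W = [[0.49, 0.49],
     [0.49, 0.49]]
b = [0.0, 0.0]

def step(h):
    """One input-free update of the toy RNN."""
    return [math.tanh(sum(W[i][j] * h[j] for j in range(2)) + b[i])
            for i in range(2)]

def speed(h):
    """q(h) = 0.5 * ||F(h) - h||^2, which is zero exactly at a fixed point."""
    f = step(h)
    return 0.5 * sum((f[i] - h[i]) ** 2 for i in range(2))

def find_stable_fixed_point(h, n_steps=2000):
    """Iterate the map until it settles; unstable fixed points are missed."""
    for _ in range(n_steps):
        h = step(h)
    return h

def jacobian(h, eps=1e-6):
    """Central-difference Jacobian dF/dh evaluated at h."""
    J = [[0.0, 0.0], [0.0, 0.0]]
    for j in range(2):
        hp, hm = list(h), list(h)
        hp[j] += eps
        hm[j] -= eps
        fp, fm = step(hp), step(hm)
        for i in range(2):
            J[i][j] = (fp[i] - fm[i]) / (2 * eps)
    return J

def eig2x2(J):
    """Eigenvalues of a 2x2 matrix, assuming they are real."""
    tr = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    return (tr + disc) / 2, (tr - disc) / 2

h_star = find_stable_fixed_point([0.5, 0.3])
lam_slow, lam_fast = eig2x2(jacobian(h_star))
# Time constant (in steps) of the slow mode of the linearized system.
tau_slow = abs(1.0 / math.log(abs(lam_slow)))
```

In this toy case the linearization has one eigenvalue close to 1 and one close to 0. Perturbations along the slow eigenvector decay over tens of steps while the other direction collapses almost immediately, which is the signature of an approximate line attractor.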

Reviews: Reverse engineering recurrent networks for sentiment classification reveals line attractor dynamics

Neural Information Processing Systems, Jan-27-2025, 10:57:28 GMT

UPDATE after reading the author rebuttal: I look forward to the changes in the final version of the paper. Detailed comments: 1. Understanding of RNNs for the sentiment classification task, with theoretical analysis backed by empirical observations: This work takes up the sentiment classification task. The authors identify fixed points of the network and center their analysis of RNNs around them. The RNN states can be cast onto a 1-dimensional manifold of these fixed points. PCA of the RNN states across examples reveals that training helps RNNs find a lower-dimensional representation. Interestingly, movement along this low-dimensional manifold is minimal in the absence of inputs or in the presence of neutral or uninformative words, whereas the states move more when polarity-bearing words are present, thus showing linear separability effects along this 1-D manifold.
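The behavior the reviewer describes, little movement on neutral words and larger movement on polarity-bearing ones, can be sketched with a toy linear system. Everything here is an assumption made for illustration: the matrix `J`, the per-word input drives in `B`, and the tiny vocabulary are invented, and the state is projected onto a known slow direction rather than onto a principal component estimated from data.

```python
import math

# Hypothetical linearized update near the attractor: h_next = J h + B[word].
# J has a slow mode along [1, 1] (eigenvalue 0.98) and a fast mode (eigenvalue 0).
J = [[0.49, 0.49],
     [0.49, 0.49]]
slow_dir = [1 / math.sqrt(2), 1 / math.sqrt(2)]

# Made-up input drives: polar words push along the slow direction,
# neutral words contribute nothing.
B = {
    "great": [0.5, 0.5],
    "awful": [-0.5, -0.5],
    "the": [0.0, 0.0],
    "movie": [0.0, 0.0],
}

def run(words):
    """Return the coordinate along the slow direction after each word."""
    h = [0.0, 0.0]
    coords = []
    for w in words:
        h = [sum(J[i][j] * h[j] for j in range(2)) + B[w][i] for i in range(2)]
        coords.append(sum(h[i] * slow_dir[i] for i in range(2)))
    return coords

coords = run(["the", "movie", "great", "the", "great"])
```

In this sketch the coordinate stays at zero through the two neutral words, jumps when "great" arrives, decays only slightly across the next neutral word, and jumps again on the second "great". A threshold on that single coordinate would separate positive from negative sequences, which is the linear separability the review refers to.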

Reviews: Reverse engineering recurrent networks for sentiment classification reveals line attractor dynamics

Neural Information Processing Systems, Jan-27-2025, 10:57:17 GMT

This paper provides an insightful analysis of the decision processes actually implemented by a trained recurrent network for sentiment classification, and uncovers simple line attractor dynamics. All reviewers agree that this is interesting and illuminating, and that the work is a good example of what can be done to open the black box of deep systems.

Reverse engineering recurrent networks for sentiment classification reveals line attractor dynamics

Neural Information Processing Systems, Oct-11-2024, 01:42:57 GMT

Reverse engineering recurrent networks for sentiment classification reveals line attractor dynamics

Maheswaranathan, Niru, Williams, Alex, Golub, Matthew, Ganguli, Surya, Sussillo, David

Neural Information Processing Systems, Mar-19-2020, 03:04:22 GMT
